45 research outputs found

    Synthesis of Randomized Accuracy-Aware Map-Fold Programs

    Get PDF
    We present Syndy, a technique for automatically synthesizing randomized map/fold computations that trade accuracy for performance. Given a specification of a fully accurate computation, Syndy automatically synthesizes approximate implementations of map and fold tasks, explores the approximate computation space that these approximations induce, and derives an accuracy versus performance tradeoff curve that characterizes the explored space. Each point on the curve corresponds to an approximate randomized program configuration that realizes the probabilistic error and time bounds associated with that point

    Probabilistic and Statistical Analysis of Perforated Patterns

    Get PDF
    We present a new foundation for the analysis and transformation of computer programs.Standard approaches involve the use of logical reasoning to prove that the applied transformation does not change the observable semantics of the program. Our approach, in contrast, uses probabilistic and statistical reasoning to justify the application of transformations that may change, within probabilistic bounds, the result that the program produces. Loop perforation transforms loops to execute fewer iterations. We show how to use our basic approach to justify the application of loop perforation to a set of computational patterns. Empirical results from computations drawn from the PARSEC benchmark suite demonstrate that these computational patterns occur in practice. We also outline a specification methodology that enables the transformation of subcomputations and discuss how to automate the approach

    Automatic Parallelization With Statistical Accuracy Bounds

    Get PDF
    Traditional parallelizing compilers are designed to generate parallel programs that produce identical outputs as the original sequential program. The difficulty of performing the program analysis required to satisfy this goal and the restricted space of possible target parallel programs have both posed significant obstacles to the development of effective parallelizing compilers. The QuickStep compiler is instead designed to generate parallel programs that satisfy statistical accuracy guarantees. The freedom to generate parallel programs whose output may differ (within statistical accuracy bounds) from the output of the sequential program enables a dramatic simplification of the compiler and a significant expansion in the range of parallel programs that it can legally generate. QuickStep exploits this flexibility to take a fundamentally different approach from traditional parallelizing compilers. It applies a collection of transformations (loop parallelization, loop scheduling, synchronization introduction, and replication introduction) to generate a search space of parallel versions of the original sequential program. It then searches this space (prioritizing the parallelization of the most time-consuming loops in the application) to find a final parallelization that exhibits good parallel performance and satisfies the statistical accuracy guarantee. At each step in the search it performs a sequence of trial runs on representative inputs to examine the performance, accuracy, and memory accessing characteristics of the current generated parallel program. An analysis of these characteristics guides the steps the compiler takes as it explores the search space of parallel programs. Results from our benchmark set of applications show that QuickStep can automatically generate parallel programs with good performance and statistically accurate outputs. For two of the applications, the parallelization introduces noise into the output, but the noise remains within acceptable statistical bounds. The simplicity of the compilation strategy and the performance and statistical acceptability of the generated parallel programs demonstrate the advantages of the QuickStep approach

    Parallelizing Sequential Programs With Statistical Accuracy Tests

    Get PDF
    We present QuickStep, a novel system for parallelizing sequential programs. QuickStep deploys a set of parallelization transformations that together induce a search space of candidate parallel programs. Given a sequential program, representative inputs, and an accuracy requirement, QuickStep uses performance measurements, profiling information, and statistical accuracy tests on the outputs of candidate parallel programs to guide its search for a parallelizationthat maximizes performance while preserving acceptable accuracy. When the search completes, QuickStep produces an interactive report that summarizes the applied parallelization transformations, performance, and accuracy results for the automatically generated candidate parallel programs. In our envisioned usage scenarios, the developer examines this report to evaluate the acceptability of the final parallelization and to obtain insight into how the original sequential program responds to different parallelization strategies. Itis also possible for the developer (or even a user of the program who has no software development expertise whatsoever) to simply use the best parallelization out of the box without examining the report or further investigating the parallelization. Results from our benchmark set of applications show that QuickStep can automatically generate accurate and efficient parallel programs---the automatically generated parallel versions of five of our six benchmark applications run between 5.0 and 7.7 times faster on 8 cores than the original sequential versions. Moreover, a comparison with the Intel icc compiler highlights how QuickStep can effectively parallelize applications with features (such as the use of modern object-oriented programming constructs or desirable parallelizations with infrequent but acceptable data races) that place them inherently beyond the reach of standard approaches

    GAS: Generating Fast and Accurate Surrogate Models for Autonomous Vehicle Systems

    Full text link
    Modern autonomous vehicle systems use complex perception and control components. These components can rapidly change during development of such systems, requiring constant re-testing. Unfortunately, high-fidelity simulations of these complex systems for evaluating vehicle safety are costly. The complexity also hinders the creation of less computationally intensive surrogate models. We present GAS, the first approach for creating surrogate models of complete (perception, control, and dynamics) autonomous vehicle systems containing complex perception and/or control components. GAS's two-stage approach first replaces complex perception components with a perception model. Then, GAS constructs a polynomial surrogate model of the complete vehicle system using Generalized Polynomial Chaos (GPC). We demonstrate the use of these surrogate models in two applications. First, we estimate the probability that the vehicle will enter an unsafe state over time. Second, we perform global sensitivity analysis of the vehicle system with respect to its state in a previous time step. GAS's approach also allows for reuse of the perception model when vehicle control and dynamics characteristics are altered during vehicle development, saving significant time. We consider five scenarios concerning crop management vehicles that must not crash into adjacent crops, self driving cars that must stay within their lane, and unmanned aircraft that must avoid collision. Each of the systems in these scenarios contain a complex perception or control component. Using GAS, we generate surrogate models for these systems, and evaluate the generated models in the applications described above. GAS's surrogate models provide an average speedup of 3.7×3.7\times for safe state probability estimation (minimum 2.1×2.1\times) and 1.4×1.4\times for sensitivity analysis (minimum 1.3×1.3\times), while still maintaining high accuracy

    Provable Defense Against Geometric Transformations

    Full text link
    Geometric image transformations that arise in the real world, such as scaling and rotation, have been shown to easily deceive deep neural networks (DNNs). Hence, training DNNs to be certifiably robust to these perturbations is critical. However, no prior work has been able to incorporate the objective of deterministic certified robustness against geometric transformations into the training procedure, as existing verifiers are exceedingly slow. To address these challenges, we propose the first provable defense for deterministic certified geometric robustness. Our framework leverages a novel GPU-optimized verifier that can certify images between 60×\times to 42,600×\times faster than existing geometric robustness verifiers, and thus unlike existing works, is fast enough for use in training. Our results across multiple datasets show that networks trained via our framework consistently achieve state-of-the-art deterministic certified geometric robustness and clean accuracy. Furthermore, for the first time, we verify the geometric robustness of a neural network for the challenging, real-world setting of autonomous driving

    Reasoning about Relaxed Programs

    Get PDF
    A number of approximate program transformations have recently emerged that enable transformed programs to trade accuracy of their results for increased performance by dynamically and nondeterministically modifying variables that control program execution. We call such transformed programs relaxed programs -- they have been extended with additional nondeterminism to relax their semantics and offer greater execution flexibility. We present programming language constructs for developing relaxed programs and proof rules for reasoning about properties of relaxed programs. Our proof rules enable programmers to directly specify and verify acceptability properties that characterize the desired correctness relationships between the values of variables in a program's original semantics (before transformation) and its relaxed semantics. Our proof rules also support the verification of safety properties (which characterize desirable properties involving values in individual executions). The rules are designed to support a reasoning approach in which the majority of the reasoning effort uses the original semantics. This effort is then reused to establish the desired properties of the program under the relaxed semantics. We have formalized the dynamic semantics of our target programming language and the proof rules in Coq, and verified that the proof rules are sound with respect to the dynamic semantics. Our Coq implementation enables developers to obtain fully machine checked verifications of their relaxed programs

    Managing performance vs. accuracy trade-offs with loop perforation

    Get PDF
    Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine learning algorithms) are designed to trade off accuracy in return for increased performance. To date, such computations typically use ad-hoc, domain-specific techniques developed specifically for the computation at hand. Loop perforation provides a general technique to trade accuracy for performance by transforming loops to execute a subset of their iterations. A criticality testing phase filters out critical loops (whose perforation produces unacceptable behavior) to identify tunable loops (whose perforation produces more efficient and still acceptably accurate computations). A perforation space exploration algorithm perforates combinations of tunable loops to find Pareto-optimal perforation policies. Our results indicate that, for a range of applications, this approach typically delivers performance increases of over a factor of two (and up to a factor of seven) while changing the result that the application produces by less than 10%

    Detecting and escaping infinite loops with jolt

    Get PDF
    25th European Conference, Lancaster, Uk, July 25-29, 2011 ProceedingsInfinite loops can make applications unresponsive. Potential problems include lost work or output, denied access to application functionality, and a lack of responses to urgent events. We present Jolt, a novel system for dynamically detecting and escaping infinite loops. At the user’s request, Jolt attaches to an application to monitor its progress. Specifically, Jolt records the program state at the start of each loop iteration. If two consecutive loop iterations produce the same state, Jolt reports to the user that the application is in an infinite loop. At the user’s option, Jolt can then transfer control to a statement following the loop, thereby allowing the application to escape the infinite loop and ideally continue its productive execution. The immediate goal is to enable the application to execute long enough to save any pending work, finish any in-progress computations, or respond to any urgent events. We evaluated Jolt by applying it to detect and escape eight infinite loops in five benchmark applications. Jolt was able to detect seven of the eight infinite loops (the eighth changes the state on every iteration). We also evaluated the effect of escaping an infinite loop as an alternative to terminating the application. In all of our benchmark applications, escaping an infinite loop produced a more useful output than terminating the application. Finally, we evaluated how well escaping from an infinite loop approximated the correction that the developers later made to the application. For two out of our eight loops, escaping the infinite loop produced the same output as the corrected version of the application

    Using Code Perforation to Improve Performance, Reduce Energy Consumption, and Respond to Failures

    Get PDF
    Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine learning algorithms) are designed to trade off accuracy in return for increased performance. To date, such computations typically use ad-hoc, domain-specific techniques developed specifically for the computation at hand. We present a new general technique, code perforation, for automatically augmenting existing computations with the capability of trading off accuracy in return for performance. In contrast to existing approaches, which typically require the manual development of new algorithms, our implemented SpeedPress compiler can automatically apply code perforation to existing computations with no developer intervention whatsoever. The result is a transformed computation that can respond almost immediately to a range of increased performancedemands while keeping any resulting output distortion within acceptable user-defined bounds. We have used SpeedPress to automatically apply code perforation to applications from the PARSEC benchmark suite. The results show that the transformed applications can run as much as two to three times faster than the original applications while distorting the output by less than 10%. Because the transformed applications can operate successfully at many points in the performance/accuracy tradeoff space, they can (dynamically and on demand) navigate the tradeoff space to either maximize performance subject to a given accuracy constraint, or maximize accuracy subject to a given performance constraint. We also demonstrate the SpeedGuard runtime system which uses code perforation to enable applications to automatically adapt to challenging execution environments such as multicore machines that suffer core failures or machines that dynamically adjust the clock speed to reduce power consumption or to protect the machine from overheating
    corecore